{
  "cells": [
    {
      "attachments": {},
      "cell_type": "markdown",
      "metadata": {
        "nteract": {
          "transient": {
            "deleting": false
          }
        }
      },
      "source": [
        "# Chapter 7: Building Chat Applications\n",
        "## Azure OpenAI API Quickstart"
      ]
    },
    {
      "attachments": {},
      "cell_type": "markdown",
      "metadata": {
        "nteract": {
          "transient": {
            "deleting": false
          }
        }
      },
      "source": [
        "## Overview\n",
        "This notebook is adapted from the [Azure OpenAI Samples Repository](https://github.com/Azure/azure-openai-samples?WT.mc_id=academic-105485-koreyst) that includes notebooks that also access the [OpenAI](notebook-openai.ipynb) service.\n",
        "\n",
        "The Python OpenAI API works with Azure OpenAI as well, with a few modifications. Learn more about the differences here: [How to switch between OpenAI and Azure OpenAI endpoints with Python](https://learn.microsoft.com/azure/ai-services/openai/how-to/switching-endpoints?WT.mc_id=academic-109527-jasmineg)\n",
        "\n",
        "For more quickstart examples please refer to the official [Azure OpenAI Quickstart Documentation](https://learn.microsoft.com/azure/cognitive-services/openai/quickstart?pivots=programming-language-studio&WT.mc_id=academic-105485-koreyst)"
      ]
    },
    {
      "attachments": {},
      "cell_type": "markdown",
      "metadata": {
        "nteract": {
          "transient": {
            "deleting": false
          }
        }
      },
      "source": [
        "## Table of Contents  \n",
        "\n",
        "[Overview](#overview)  \n",
        "[Getting started with Azure OpenAI Service](#getting-started-with-azure-openai-service)  \n",
        "[Build your first prompt](#build-your-first-prompt)  \n",
        "\n",
        "[Use Cases](#use-cases)    \n",
        "[1. Summarize Text](#summarize-text)  \n",
        "[2. Classify Text](#classify-text)  \n",
        "[3. Generate New Product Names](#generate-new-product-names)  \n",
        "[4. Fine Tune a Classifier](#fine-tune-a-classifier)  \n",
        "[5. Embeddings](#embeddings)\n",
        "\n",
        "[References](#references)"
      ]
    },
    {
      "attachments": {},
      "cell_type": "markdown",
      "metadata": {
        "nteract": {
          "transient": {
            "deleting": false
          }
        }
      },
      "source": [
        "### Getting started with Azure OpenAI Service\n",
        "\n",
        "New customers will need to [apply for access](https://aka.ms/oai/access?WT.mc_id=academic-105485-koreyst) to Azure OpenAI Service.  \n",
        "After approval is complete, customers can log into the Azure portal, create an Azure OpenAI Service resource, and start experimenting with models via the studio  \n",
        "\n",
        "[Great resource for getting started quickly](https://techcommunity.microsoft.com/t5/educator-developer-blog/azure-openai-is-now-generally-available/ba-p/3719177?WT.mc_id=academic-105485-koreyst)\n"
      ]
    },
    {
      "attachments": {},
      "cell_type": "markdown",
      "metadata": {
        "nteract": {
          "transient": {
            "deleting": false
          }
        }
      },
      "source": [
        "### Build your first prompt  \n",
        "This short exercise will provide a basic introduction for submitting prompts to an OpenAI model for a simple task \"summarization\".  \n",
        "\n",
        "![](images/generative-AI-models-reduced.jpg)  \n",
        "\n",
        "\n",
        "**Steps**:  \n",
        "1. Install OpenAI library in your python environment  \n",
        "2. Load standard helper libraries and set your typical OpenAI security credentials for the OpenAI Service that you've created  \n",
        "3. Choose a model for your task  \n",
        "4. Create a simple prompt for the model  \n",
        "5. Submit your request to the model API!"
      ]
    },
    {
      "attachments": {},
      "cell_type": "markdown",
      "metadata": {
        "nteract": {
          "transient": {
            "deleting": false
          }
        }
      },
      "source": [
        "### 1. Install OpenAI"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "  > [!NOTE] This step is not necessary if run this notebook on Codespaces or within a Devcontainer"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "gather": {
          "logged": 1674254990318
        },
        "jupyter": {
          "outputs_hidden": true,
          "source_hidden": false
        },
        "nteract": {
          "transient": {
            "deleting": false
          }
        }
      },
      "outputs": [],
      "source": [
        "%pip install openai python-dotenv"
      ]
    },
    {
      "attachments": {},
      "cell_type": "markdown",
      "metadata": {
        "nteract": {
          "transient": {
            "deleting": false
          }
        }
      },
      "source": [
        "### 2. Import helper libraries and instantiate credentials"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "gather": {
          "logged": 1674829434433
        },
        "jupyter": {
          "outputs_hidden": false,
          "source_hidden": false
        },
        "nteract": {
          "transient": {
            "deleting": false
          }
        }
      },
      "outputs": [],
      "source": [
        "import os\n",
        "from openai import AzureOpenAI\n",
        "import numpy as np\n",
        "from dotenv import load_dotenv\n",
        "load_dotenv()\n",
        "\n",
        "#validate data inside .env file\n",
        "\n",
        "client = AzureOpenAI(\n",
        "  api_key=os.environ['AZURE_OPENAI_KEY'],  # this is also the default, it can be omitted\n",
        "  api_version = \"2023-05-15\"\n",
        "  )"
      ]
    },
    {
      "attachments": {},
      "cell_type": "markdown",
      "metadata": {
        "nteract": {
          "transient": {
            "deleting": false
          }
        }
      },
      "source": [
        "### 3. Finding the right model  \n",
        "The GPT-3.5-turbo or GPT-4 models can understand and generate natural language. The service offers four model capabilities, each with different levels of power and speed suitable for different tasks. \n",
        "\n",
        "[Azure OpenAI models](https://learn.microsoft.com/azure/cognitive-services/openai/concepts/models?WT.mc_id=academic-105485-koreyst)  \n",
        "![](images/a-b-c-d-models-reduced.jpg)  \n"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "gather": {
          "logged": 1674742720788
        },
        "jupyter": {
          "outputs_hidden": true,
          "source_hidden": false
        },
        "nteract": {
          "transient": {
            "deleting": false
          }
        }
      },
      "outputs": [],
      "source": [
        "# Select the General Purpose curie model for text\n",
        "model = os.environ['AZURE_OPENAI_DEPLOYMENT']"
      ]
    },
    {
      "attachments": {},
      "cell_type": "markdown",
      "metadata": {
        "nteract": {
          "transient": {
            "deleting": false
          }
        }
      },
      "source": [
        "## 4. Prompt Design  \n",
        "\n",
        "\"The magic of large language models is that by being trained to minimize this prediction error over vast quantities of text, the models end up learning concepts useful for these predictions. For example, they learn concepts like\"(1):\n",
        "\n",
        "* how to spell\n",
        "* how grammar works\n",
        "* how to paraphrase\n",
        "* how to answer questions\n",
        "* how to hold a conversation\n",
        "* how to write in many languages\n",
        "* how to code\n",
        "* etc.\n",
        "\n",
        "#### How to control a large language model  \n",
        "\"Of all the inputs to a large language model, by far the most influential is the text prompt(1).\n",
        "\n",
        "Large language models can be prompted to produce output in a few ways:\n",
        "\n",
        "- Instruction: Tell the model what you want\n",
        "- Completion: Induce the model to complete the beginning of what you want\n",
        "- Demonstration: Show the model what you want, with either:\n",
        "- A few examples in the prompt\n",
        "- Many hundreds or thousands of examples in a fine-tuning training dataset\n",
        "\n",
        "\n",
        "\n",
        "#### There are three basic guidelines to creating prompts:\n",
        "\n",
        "**Show and tell**. Make it clear what you want either through instructions, examples, or a combination of the two. If you want the model to rank a list of items in alphabetical order or to classify a paragraph by sentiment, show it that's what you want.\n",
        "\n",
        "**Provide quality data**. If you're trying to build a classifier or get the model to follow a pattern, make sure that there are enough examples. Be sure to proofread your examples — the model is usually smart enough to see through basic spelling mistakes and give you a response, but it also might assume this is intentional and it can affect the response.\n",
        "\n",
        "**Check your settings.** The temperature and top_p settings control how deterministic the model is in generating a response. If you're asking it for a response where there's only one right answer, then you'd want to set these lower. If you're looking for more diverse responses, then you might want to set them higher. The number one mistake people make with these settings is assuming that they're \"cleverness\" or \"creativity\" controls.\n",
        "\n",
        "\n",
        "Source: https://github.com/Azure/OpenAI/blob/main/How%20to/Completions.md"
      ]
    },
    {
      "attachments": {},
      "cell_type": "markdown",
      "metadata": {
        "nteract": {
          "transient": {
            "deleting": false
          }
        }
      },
      "source": [
        "![](images/prompt_design.jpg)\n",
        "image is creating your first text prompt!"
      ]
    },
    {
      "attachments": {},
      "cell_type": "markdown",
      "metadata": {
        "nteract": {
          "transient": {
            "deleting": false
          }
        }
      },
      "source": [
        "### 5. Submit!"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "gather": {
          "logged": 1674494935186
        },
        "jupyter": {
          "outputs_hidden": false,
          "source_hidden": false
        },
        "nteract": {
          "transient": {
            "deleting": false
          }
        }
      },
      "outputs": [],
      "source": [
        "# Create your first prompt\n",
        "text_prompt = \"Should oxford commas always be used?\"\n",
        "\n",
        "response = client.chat.completions.create(\n",
        "  model=model,\n",
        "  messages = [{\"role\":\"system\", \"content\":\"You are a helpful assistant.\"},\n",
        "               {\"role\":\"user\",\"content\":text_prompt},])\n",
        "\n",
        "response.choices[0].message.content"
      ]
    },
    {
      "attachments": {},
      "cell_type": "markdown",
      "metadata": {
        "nteract": {
          "transient": {
            "deleting": false
          }
        }
      },
      "source": [
        "### Repeat the same call, how do the results compare?"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "gather": {
          "logged": 1674494940872
        },
        "jupyter": {
          "outputs_hidden": false,
          "source_hidden": false
        },
        "nteract": {
          "transient": {
            "deleting": false
          }
        }
      },
      "outputs": [],
      "source": [
        "\n",
        "response = client.chat.completions.create(\n",
        "  model=model,\n",
        "  messages = [{\"role\":\"system\", \"content\":\"You are a helpful assistant.\"},\n",
        "               {\"role\":\"user\",\"content\":text_prompt},])\n",
        "\n",
        "response.choices[0].message.content"
      ]
    },
    {
      "attachments": {},
      "cell_type": "markdown",
      "metadata": {
        "nteract": {
          "transient": {
            "deleting": false
          }
        }
      },
      "source": [
        "## Summarize Text  \n",
        "#### Challenge  \n",
        "Summarize text by adding a 'tl;dr:' to the end of a text passage. Notice how the model understands how to perform a number of tasks with no additional instructions. You can experiment with more descriptive prompts than tl;dr to modify the model’s behavior and customize the summarization you receive(3).  \n",
        "\n",
        "Recent work has demonstrated substantial gains on many NLP tasks and benchmarks by pre-training on a large corpus of text followed by fine-tuning on a specific task. While typically task-agnostic in architecture, this method still requires task-specific fine-tuning datasets of thousands or tens of thousands of examples. By contrast, humans can generally perform a new language task from only a few examples or from simple instructions - something that current NLP systems still largely struggle to do. Here we show that scaling up language models greatly improves task-agnostic, few-shot performance, sometimes even reaching competitiveness with prior state-of-the-art fine-tuning approaches. \n",
        "\n",
        "\n",
        "\n",
        "Tl;dr"
      ]
    },
    {
      "attachments": {},
      "cell_type": "markdown",
      "metadata": {
        "nteract": {
          "transient": {
            "deleting": false
          }
        }
      },
      "source": [
        "# Exercises for several use cases  \n",
        "1. Summarize Text  \n",
        "2. Classify Text  \n",
        "3. Generate New Product Names\n",
        "4. Embeddings\n",
        "5. Fine tune a classifier"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "gather": {
          "logged": 1674495198534
        },
        "jupyter": {
          "outputs_hidden": false,
          "source_hidden": false
        },
        "nteract": {
          "transient": {
            "deleting": false
          }
        }
      },
      "outputs": [],
      "source": [
        "prompt = \"Recent work has demonstrated substantial gains on many NLP tasks and benchmarks by pre-training on a large corpus of text followed by fine-tuning on a specific task. While typically task-agnostic in architecture, this method still requires task-specific fine-tuning datasets of thousands or tens of thousands of examples. By contrast, humans can generally perform a new language task from only a few examples or from simple instructions - something that current NLP systems still largely struggle to do. Here we show that scaling up language models greatly improves task-agnostic, few-shot performance, sometimes even reaching competitiveness with prior state-of-the-art fine-tuning approaches.\\n\\ntl;dr\"\n"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "gather": {
          "logged": 1674495201868
        },
        "jupyter": {
          "outputs_hidden": false,
          "source_hidden": false
        },
        "nteract": {
          "transient": {
            "deleting": false
          }
        }
      },
      "outputs": [],
      "source": [
        "#Setting a few additional, typical parameters during API Call\n",
        "\n",
        "response = client.chat.completions.create(\n",
        "  model=model,\n",
        "  messages = [{\"role\":\"system\", \"content\":\"You are a helpful assistant.\"},\n",
        "               {\"role\":\"user\",\"content\":prompt},])\n",
        "\n",
        "response.choices[0].message.content"
      ]
    },
    {
      "attachments": {},
      "cell_type": "markdown",
      "metadata": {
        "nteract": {
          "transient": {
            "deleting": false
          }
        }
      },
      "source": [
        "## Classify Text  \n",
        "#### Challenge  \n",
        "Classify items into categories provided at inference time. In the following example, we provide both the categories and the text to classify in the prompt(*playground_reference). \n",
        "\n",
        "Customer Inquiry: Hello, one of the keys on my laptop keyboard broke recently and I'll need a replacement:\n",
        "\n",
        "Classified category:\n"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "gather": {
          "logged": 1674499424645
        },
        "jupyter": {
          "outputs_hidden": false,
          "source_hidden": false
        },
        "nteract": {
          "transient": {
            "deleting": false
          }
        }
      },
      "outputs": [],
      "source": [
        "prompt = \"Classify the following inquiry into one of the following: categories: [Pricing, Hardware Support, Software Support]\\n\\ninquiry: Hello, one of the keys on my laptop keyboard broke recently and I'll need a replacement:\\n\\nClassified category:\"\n",
        "print(prompt)"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "gather": {
          "logged": 1674499378518
        },
        "jupyter": {
          "outputs_hidden": false,
          "source_hidden": false
        },
        "nteract": {
          "transient": {
            "deleting": false
          }
        }
      },
      "outputs": [],
      "source": [
        "#Setting a few additional, typical parameters during API Call\n",
        "\n",
        "response = client.chat.completions.create(\n",
        "  model=model,\n",
        "  messages = [{\"role\":\"system\", \"content\":\"You are a helpful assistant.\"},\n",
        "               {\"role\":\"user\",\"content\":prompt},])\n",
        "\n",
        "response.choices[0].message.content"
      ]
    },
    {
      "attachments": {},
      "cell_type": "markdown",
      "metadata": {
        "nteract": {
          "transient": {
            "deleting": false
          }
        }
      },
      "source": [
        "## Generate New Product Names\n",
        "#### Challenge\n",
        "Create product names from examples words. Here we include in the prompt information about the product we are going to generate names for. We also provide a similar example to show the pattern we wish to receive. We have also set the temperature value high to increase randomness and more innovative responses.\n",
        "\n",
        "Product description: A home milkshake maker\n",
        "Seed words: fast, healthy, compact.\n",
        "Product names: HomeShaker, Fit Shaker, QuickShake, Shake Maker\n",
        "\n",
        "Product description: A pair of shoes that can fit any foot size.\n",
        "Seed words: adaptable, fit, omni-fit."
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "gather": {
          "logged": 1674257087279
        },
        "jupyter": {
          "outputs_hidden": false,
          "source_hidden": false
        },
        "nteract": {
          "transient": {
            "deleting": false
          }
        }
      },
      "outputs": [],
      "source": [
        "prompt = \"Product description: A home milkshake maker\\nSeed words: fast, healthy, compact.\\nProduct names: HomeShaker, Fit Shaker, QuickShake, Shake Maker\\n\\nProduct description: A pair of shoes that can fit any foot size.\\nSeed words: adaptable, fit, omni-fit.\"\n",
        "\n",
        "print(prompt)"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "jupyter": {
          "outputs_hidden": false,
          "source_hidden": false
        },
        "nteract": {
          "transient": {
            "deleting": false
          }
        }
      },
      "outputs": [],
      "source": [
        "#Setting a few additional, typical parameters during API Call\n",
        "\n",
        "response = client.chat.completions.create(\n",
        "  model=model,\n",
        "  messages = [{\"role\":\"system\", \"content\":\"You are a helpful assistant.\"},\n",
        "               {\"role\":\"user\",\"content\":prompt}])\n",
        "\n",
        "response.choices[0].message.content"
      ]
    },
    {
      "attachments": {},
      "cell_type": "markdown",
      "metadata": {
        "nteract": {
          "transient": {
            "deleting": false
          }
        }
      },
      "source": [
        "## Embeddings\n",
        "This section will show how to retrieve embeddings, and find similarities between words, sentences, and documents. In order to run the following noteboooks you need to deploy a model that uses `text-embedding-ada-002` as base model and set his deployment name inside .env file, using the `AZURE_OPENAI_EMBEDDINGS_DEPLOYMENT` variable."
      ]
    },
    {
      "attachments": {},
      "cell_type": "markdown",
      "metadata": {
        "nteract": {
          "transient": {
            "deleting": false
          }
        }
      },
      "source": [
        "### Model Taxonomy - Choosing an embedding model\n",
        "\n",
        "**Model taxonomy**: {family} - {capability} - {input-type} - {identifier}  \n",
        "\n",
        "{family}     --> text-embedding  (embeddings family)  \n",
        "{capability} --> ada             (all the other embedding models will be retired in 2024)  \n",
        "{input-type} --> n/a             (only specified for search models)  \n",
        "{identifier} --> 002             (version 002)  \n",
        "\n",
        "model = 'text-embedding-ada-002'"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "  > [!NOTE] The following step is not necessary if run this notebook on Codespaces or within a Devcontainer"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {},
      "outputs": [],
      "source": [
        "# Dependencies for embeddings_utils\n",
        "%pip install matplotlib plotly scikit-learn pandas"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "gather": {
          "logged": 1674829364153
        },
        "jupyter": {
          "outputs_hidden": false,
          "source_hidden": false
        },
        "nteract": {
          "transient": {
            "deleting": false
          }
        }
      },
      "outputs": [],
      "source": [
        "def cosine_similarity(a, b):\n",
        "    return np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "gather": {
          "logged": 1674829424097
        },
        "jupyter": {
          "outputs_hidden": false,
          "source_hidden": false
        },
        "nteract": {
          "transient": {
            "deleting": false
          }
        }
      },
      "outputs": [],
      "source": [
        "text = 'the quick brown fox jumped over the lazy dog'\n",
        "model= os.environ['AZURE_OPENAI_EMBEDDINGS_DEPLOYMENT']\n",
        "client.embeddings.create(input='[text]', model=model).data[0].embedding"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "gather": {
          "logged": 1674829555255
        },
        "jupyter": {
          "outputs_hidden": false,
          "source_hidden": false
        },
        "nteract": {
          "transient": {
            "deleting": false
          }
        }
      },
      "outputs": [],
      "source": [
        "\n",
        "# compare several words\n",
        "automobile_embedding  = client.embeddings.create(input='automobile', model=model).data[0].embedding\n",
        "vehicle_embedding     = client.embeddings.create(input='vehicle', model=model).data[0].embedding\n",
        "dinosaur_embedding    = client.embeddings.create(input='dinosaur', model=model).data[0].embedding\n",
        "stick_embedding       = client.embeddings.create(input='stick', model=model).data[0].embedding\n",
        "\n",
        "print(cosine_similarity(automobile_embedding, vehicle_embedding))\n",
        "print(cosine_similarity(automobile_embedding, dinosaur_embedding))\n",
        "print(cosine_similarity(automobile_embedding, stick_embedding))"
      ]
    },
    {
      "attachments": {},
      "cell_type": "markdown",
      "metadata": {
        "nteract": {
          "transient": {
            "deleting": false
          }
        }
      },
      "source": [
        "## Comparing article from cnn daily news dataset\n",
        "source: https://huggingface.co/datasets/cnn_dailymail\n"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "gather": {
          "logged": 1674831122093
        },
        "jupyter": {
          "outputs_hidden": false,
          "source_hidden": false
        },
        "nteract": {
          "transient": {
            "deleting": false
          }
        }
      },
      "outputs": [],
      "source": [
        "import pandas as pd\n",
        "cnn_daily_articles = ['BREMEN, Germany -- Carlos Alberto, who scored in FC Porto\\'s Champions League final victory against Monaco in 2004, has joined Bundesliga club Werder Bremen for a club record fee of 7.8 million euros ($10.7 million). Carlos Alberto enjoyed success at FC Porto under Jose Mourinho. \"I\\'m here to win titles with Werder,\" the 22-year-old said after his first training session with his new club. \"I like Bremen and would only have wanted to come here.\" Carlos Alberto started his career with Fluminense, and helped them to lift the Campeonato Carioca in 2002. In January 2004 he moved on to FC Porto, who were coached by José Mourinho, and the club won the Portuguese title as well as the Champions League. Early in 2005, he moved to Corinthians, where he impressed as they won the Brasileirão,but in 2006 Corinthians had a poor season and Carlos Alberto found himself at odds with manager, Emerson Leão. Their poor relationship came to a climax at a Copa Sul-Americana game against Club Atlético Lanús, and Carlos Alberto declared that he would not play for Corinthians again while Leão remained as manager. Since January this year he has been on loan with his first club Fluminense. Bundesliga champions VfB Stuttgart said on Sunday that they would sign a loan agreement with Real Zaragoza on Monday for Ewerthon, the third top Brazilian player to join the German league in three days. A VfB spokesman said Ewerthon, who played in the Bundesliga for Borussia Dortmund from 2001 to 2005, was expected to join the club for their pre-season training in Austria on Monday. On Friday, Ailton returned to Germany where he was the league\\'s top scorer in 2004, signing a one-year deal with Duisburg on a transfer from Red Star Belgrade. E-mail to a friend .',\n",
        "                        '(CNN) -- Football superstar, celebrity, fashion icon, multimillion-dollar heartthrob. Now, David Beckham is headed for the Hollywood Hills as he takes his game to U.S. Major League Soccer. CNN looks at how Bekham fulfilled his dream of playing for Manchester United, and his time playing for England. The world\\'s famous footballer has begun a five-year contract with the Los Angeles Galaxy team, and on Friday Beckham will meet the press and reveal his new shirt number. This week, we take an in depth look at the life and times of Beckham, as CNN\\'s very own \"Becks,\" Becky Anderson, sets out to examine what makes the man tick -- as footballer, fashion icon and global phenomenon. It\\'s a long way from the streets of east London to the Hollywood Hills and Becky charts Beckham\\'s incredible rise to football stardom, a journey that has seen his skills grace the greatest stages in world soccer. She goes in pursuit of the current hottest property on the sports/celebrity circuit in the U.S. and along the way explores exactly what\\'s behind the man with the golden boot. CNN will look back at the life of Beckham, the wonderfully talented youngster who fulfilled his dream of playing for Manchester United, his marriage to pop star Victoria, and the trials and tribulations of playing for England. We\\'ll look at the highs (scoring against Greece), the lows (being sent off during the World Cup), the Man. U departure for the Galacticos of Madrid -- and now the Home Depot stadium in L.A. We\\'ll ask how Beckham and his family will adapt to life in Los Angeles -- the people, the places to see and be seen and the celebrity endorsement. Beckham is no stranger to exposure. He has teamed with Reggie Bush in an Adidas commercial, is the face of Motorola, is the face on a PlayStation game and doesn\\'t need fashion tips as he has his own international clothing line. But what does the star couple need to do to become an accepted part of Tinseltown\\'s glitterati? The road to major league football in the U.S.A. is a well-worn route for some of the world\\'s greatest players. We talk to some of the former greats who came before him and examine what impact these overseas stars had on U.S. soccer and look at what is different now. We also get a rare glimpse inside the David Beckham academy in L.A, find out what drives the kids and who are their heroes. The perception that in the U.S.A. soccer is a \"game for girls\" after the teenage years is changing. More and more young kids are choosing the European game over the traditional U.S. sports. E-mail to a friend .',\n",
        "                        'LOS ANGELES, California (CNN) -- Youssif, the 5-year-old burned Iraqi boy, rounded the corner at Universal Studios when suddenly the little boy hero met his favorite superhero. Youssif has always been a huge Spider-Man fan. Meeting him was \"my favorite thing,\" he said. Spider-Man was right smack dab in front of him, riding a four-wheeler amid a convoy of other superheroes. The legendary climber of buildings and fighter of evil dismounted, walked over to Youssif and introduced himself. Spidey then gave the boy from a far-away land a gentle hug, embracing him in his iconic blue and red tights. He showed Youssif a few tricks, like how to shoot a web from his wrist. Only this time, no web was spun. \"All right Youssif!\" Spider-Man said after the boy mimicked his wrist movement. Other superheroes crowded around to get a closer look. Even the Green Goblin stopped his villainous ways to tell the boy hi. Youssif remained unfazed. He didn\\'t take a liking to Spider-Man\\'s nemesis. Spidey was just too cool. \"It was my favorite thing,\" the boy said later. \"I want to see him again.\" He then felt compelled to add: \"I know it\\'s not the real Spider-Man.\" This was the day of dreams when the boy\\'s nightmares were, at least temporarily, forgotten. He met SpongeBob, Lassie and a 3-year-old orangutan named Archie. The hairy, brownish-red primate took to the boy, grabbing his hand and holding it. Even when Youssif pulled away, Archie would inch his hand back toward the boy\\'s and then snatch it. See Youssif enjoy being a boy again » . The boy giggled inside a play area where sponge-like balls shot out of toy guns. It was a far different artillery than what he was used to seeing in central Baghdad, as recently as a week ago. He squealed with delight and raced around the room collecting as many balls as he could. He rode a tram through the back stages at Universal Studios. At one point, the car shook. Fire and smoke filled the air, debris cascaded down and a big rig skidded toward the vehicle. The boy and his family survived the pretend earthquake unscathed. \"Even I was scared,\" the dad said. \"Well, I wasn\\'t,\" Youssif replied. The father and mother grinned from ear to ear throughout the day. Youssif pushed his 14-month-old sister, Ayaa, in a stroller. \"Did you even need to ask us if we were interested in coming here?\" Youssif\\'s father said in amazement. \"Other than my wedding day, this is the happiest day of my life,\" he said. Just a day earlier, the mother and father talked about their journey out of Iraq and to the United States. They also discussed that day nine months ago when masked men grabbed their son outside the family home, doused him in gas and set him on fire. His mother heard her boy screaming from inside. The father sought help for his boy across Baghdad, but no one listened. He remembers his son\\'s two months of hospitalization. The doctors didn\\'t use anesthetics. He could hear his boy\\'s piercing screams from the other side of the hospital. Watch Youssif meet his doctor and play with his little sister » . The father knew that speaking to CNN would put his family\\'s lives in jeopardy. The possibility of being killed was better than seeing his son suffer, he said. \"Anything for Youssif,\" he said. \"We had to do it.\" They described a life of utter chaos in Baghdad. Neighbors had recently given birth to a baby girl. Shortly afterward, the father was kidnapped and killed. Then, there was the time when some girls wore tanktops and jeans. They were snatched off the street by gunmen. The stories can be even more gruesome. The couple said they had heard reports that a young girl was kidnapped and beheaded --and her killers sewed a dog\\'s head on the corpse and delivered it to her family\\'s doorstep. \"These are just some of the stories,\" said Youssif\\'s mother, Zainab. Under Saddam Hussein, there was more security and stability, they said. There was running water and electricity most of the time. But still life was tough under the dictator, like the time when Zainab\\'s uncle disappeared and was never heard from again after he read a \"religious book,\" she said. Sitting in the parking lot of a Target in suburban Los Angeles, Youssif\\'s father watched as husbands and wives, boyfriends and girlfriends, parents and their children, came and went. Some held hands. Others smiled and laughed. \"Iraq finished,\" he said in what few English words he knows. He elaborated in Arabic: His homeland won\\'t be enjoying such freedoms anytime soon. It\\'s just not possible. Too much violence. Too many killings. His two children have only seen war. But this week, the family has seen a much different side of America -- an outpouring of generosity and a peaceful nation at home. \"It\\'s been a dream,\" the father said. He used to do a lot of volunteer work back in Baghdad. \"Maybe that\\'s why I\\'m being helped now,\" the father said. At Universal Studios, he looked out across the valley below. The sun glistened off treetops and buildings. It was a picturesque sight fit for a Hollywood movie. \"Good America, good America,\" he said in English. E-mail to a friend . CNN\\'s Arwa Damon contributed to this report.'\n",
        "]\n",
        "\n",
        "cnn_daily_article_highlights = ['Werder Bremen pay a club record $10.7 million for Carlos Alberto .\\nThe Brazilian midfielder won the Champions League with FC Porto in 2004 .\\nSince January he has been on loan with his first club, Fluminense .',\n",
        "                                'Beckham has agreed to a five-year contract with Los Angeles Galaxy .\\nNew contract took effect July 1, 2007 .\\nFormer English captain to meet press, unveil new shirt number Friday .\\nCNN to look at Beckham as footballer, fashion icon and global phenomenon .',\n",
        "                                'Boy on meeting Spider-Man: \"It was my favorite thing\"\\nYoussif also met SpongeBob, Lassie and an orangutan at Universal Studios .\\nDad: \"Other than my wedding day, this is the happiest day of my life\"' \n",
        "]\n",
        "\n",
        "cnn_df = pd.DataFrame({\"articles\":cnn_daily_articles, \"highligths\":cnn_daily_article_highlights})\n",
        "\n",
        "cnn_df.head()                      "
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "gather": {
          "logged": 1674831294043
        },
        "jupyter": {
          "outputs_hidden": false,
          "source_hidden": false
        },
        "nteract": {
          "transient": {
            "deleting": false
          }
        }
      },
      "outputs": [],
      "source": [
        "article1_embedding    = client.embeddings.create(input=cnn_df.articles.iloc[0], model=model).data[0].embedding\n",
        "article2_embedding    = client.embeddings.create(input=cnn_df.articles.iloc[1], model=model).data[0].embedding\n",
        "article3_embedding    = client.embeddings.create(input=cnn_df.articles.iloc[2], model=model).data[0].embedding\n",
        "\n",
        "print(cosine_similarity(article1_embedding, article2_embedding))\n",
        "print(cosine_similarity(article1_embedding, article3_embedding))"
      ]
    },
    {
      "attachments": {},
      "cell_type": "markdown",
      "metadata": {
        "nteract": {
          "transient": {
            "deleting": false
          }
        }
      },
      "source": [
        "# References  \n",
        "- [Azure Documentation - Azure OpenAI Models](https://learn.microsoft.com/azure/cognitive-services/openai/concepts/models?WT.mc_id=academic-105485-koreyst)  \n",
        "- [OpenAI Studio Examples](https://oai.azure.com/portal?WT.mc_id=academic-105485-koreyst)  "
      ]
    },
    {
      "attachments": {},
      "cell_type": "markdown",
      "metadata": {
        "nteract": {
          "transient": {
            "deleting": false
          }
        }
      },
      "source": [
        "# For More Help  \n",
        "[OpenAI Commercialization Team](AzureOpenAITeam@microsoft.com)"
      ]
    },
    {
      "attachments": {},
      "cell_type": "markdown",
      "metadata": {
        "nteract": {
          "transient": {
            "deleting": false
          }
        }
      },
      "source": [
        "# Contributors\n",
        "* Louis Li  \n"
      ]
    }
  ],
  "metadata": {
    "kernel_info": {
      "name": "python310-sdkv2"
    },
    "kernelspec": {
      "display_name": "base",
      "language": "python",
      "name": "python3"
    },
    "language_info": {
      "codemirror_mode": {
        "name": "ipython",
        "version": 3
      },
      "file_extension": ".py",
      "mimetype": "text/x-python",
      "name": "python",
      "nbconvert_exporter": "python",
      "pygments_lexer": "ipython3",
      "version": "3.10.13"
    },
    "microsoft": {
      "host": {
        "AzureML": {
          "notebookHasBeenCompleted": true
        }
      }
    },
    "nteract": {
      "version": "nteract-front-end@1.0.0"
    }
  },
  "nbformat": 4,
  "nbformat_minor": 2
}
